Unstructured Datasets Analysis: Thesaurus Model
Similar resources
Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes
An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, ...
Building Queryable Datasets from Ungrammatical and Unstructured Sources
For agents to act on behalf of users, they will have to query the vast amounts of textual data on the internet. However, much of this text cannot be queried because it is neither grammatical nor formally structured enough to support traditional information extraction approaches to annotation. Examples of such text, called “posts,” include item descriptions on eBay or internet classifieds like C...
A hybrid spam detection method based on unstructured datasets
The identification of non-genuine or malicious messages poses a variety of challenges due to the continuous changes in the techniques utilised by cyber-criminals. In this article, we propose a hybrid detection method based on a combination of image and text spam recognition techniques. In particular, the former is based on sparse representation based classification, which focuses on the global ...
Nonparametric Regression Estimation under Kernel Polynomial Model for Unstructured Data
The nonparametric estimation (NE) of the kernel polynomial regression (KPR) model is a powerful tool for visually depicting the effect of covariates on the response variable when the data are unstructured and heterogeneous. In this paper we introduce the KPR model, a mixture of nonparametric regression models with a bootstrap algorithm, which is considered in a heterogeneous and unstructured framewo...
Mining Patterns from Clinical Trial Annotated Datasets by Exploiting the NCI Thesaurus
Annotations of clinical trials with controlled vocabularies of drugs and diseases encode scientific knowledge that can be mined to discover relationships between scientific concepts. We present PAnG (Patterns in Annotation Graphs), a tool that relies on dense subgraphs, graph summarization and taxonomic distance metrics, computed using the NCI Thesaurus, to identify patterns.
Journal
Journal title: International Journal of Computer Applications Technology and Research
Year: 2017
ISSN: 2319-8656
DOI: 10.7753/ijcatr0604.1007